Credit Risk Modelling Platform

TL;DR

End-to-end credit risk system on 30,000 records. Champion/challenger architecture: WOE scorecard vs XGBoost — both tracked via MLflow. Monte Carlo simulation (500k scenarios) for VaR & CVaR. Three-scenario stress testing (Base / Mild / Severe). Fully deployed FastAPI backend + interactive HTML dashboard. IFRS 9 and Basel III methodologies applied.

30,000 Records
AUC Tracked via MLflow
500k Monte Carlo Scenarios
Live on Render
IFRS 9 / Basel III
Tech stack: Python, FastAPI, Machine Learning, XGBoost, Logistic Regression, scorecardpy, Monte Carlo, MLflow, DVC, HTML/CSS/JS

Project Overview

The Credit Risk Modelling Platform is an end-to-end credit risk analytics system built on the UCI Taiwan Credit Card Default dataset (30,000 records). Designed to meet IFRS 9 Expected Credit Loss (ECL) and Basel III stress-testing requirements, it implements a champion/challenger model architecture — a WOE-based Logistic Regression scorecard as the champion and an XGBoost classifier as the challenger — enabling real-time comparison of interpretable and ensemble-based risk signals.

The platform exposes a FastAPI backend with a business-friendly input layer (10 plain-language fields abstracting 21 raw features) and a fully interactive HTML/CSS/JavaScript frontend covering prediction, portfolio analytics, Monte Carlo simulation, stress testing, and sensitivity analysis. Experiment tracking is handled by MLflow and the training pipeline is managed via DVC with a params.yaml configuration file.
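
As a rough illustration of the business input layer described above, the sketch below expands a handful of plain-language fields into a flat dictionary of raw columns. Everything here is an assumption made for illustration: the field names are hypothetical, only six fields are shown rather than the platform's ten, and the guess that the 21 raw columns are the UCI columns other than SEX and MARRIAGE is not confirmed by the source. The real input_mapper.py may work quite differently.

```python
from dataclasses import dataclass


@dataclass
class BusinessInput:
    """Hypothetical plain-language fields (the platform itself uses 10)."""
    credit_limit: float      # mapped to LIMIT_BAL
    age: int                 # mapped to AGE
    education_level: int     # mapped to EDUCATION (dataset coding)
    months_behind: int       # repayment status, applied to every month
    typical_bill: float      # statement amount, applied to every month
    typical_payment: float   # payment amount, applied to every month


# Column names as they appear in the UCI dataset (note PAY_0 then PAY_2).
PAY_STATUS_COLS = ["PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]
BILL_COLS = [f"BILL_AMT{i}" for i in range(1, 7)]
PAY_AMT_COLS = [f"PAY_AMT{i}" for i in range(1, 7)]


def to_raw_features(inp: BusinessInput) -> dict:
    """Expand plain-language fields into raw dataset columns (assumed mapping)."""
    raw = {
        "LIMIT_BAL": inp.credit_limit,
        "AGE": inp.age,
        "EDUCATION": inp.education_level,
    }
    raw.update({col: inp.months_behind for col in PAY_STATUS_COLS})
    raw.update({col: inp.typical_bill for col in BILL_COLS})
    raw.update({col: inp.typical_payment for col in PAY_AMT_COLS})
    return raw


example = to_raw_features(BusinessInput(200_000, 35, 2, 0, 45_000, 5_000))
print(len(example), "raw columns reconstructed")
```

The point of the design is that callers only ever see the small dataclass; the expansion into model-facing columns stays on the server side and can change without breaking integrations.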

Problem Statement

Financial institutions face significant uncertainty in credit decision-making. Without a structured, data-driven approach, lenders rely on subjective judgment — leading to inconsistent approvals, under-provisioned capital buffers, and regulatory non-compliance. Key challenges include:

  • No consistent methodology for evaluating borrower default risk across a portfolio.
  • Inability to quantify portfolio-level exposure — lenders cannot see total expected losses or how losses concentrate in tail scenarios.
  • Poor stress resilience visibility: banks cannot answer "what happens to our losses if default rates rise by 30%?" without manual, one-off analyses.
  • Regulatory pressure — IFRS 9 mandates ECL reporting; Basel III requires stress-tested capital adequacy — both are difficult to produce without an automated system.
  • Model opacity — black-box models cannot explain a credit decision to a borrower or regulator, creating legal and reputational risk.

This platform addresses all five problems through a single unified system: it produces consistent, explainable credit scores (covering both the methodology and model-opacity problems); quantifies ECL across the full portfolio; stress-tests losses under adverse scenarios to support IFRS 9 and Basel III reporting; and maintains a challenger model for ongoing performance benchmarking.

Key Insights

  • Champion/Challenger architecture allows continuous model benchmarking — the scorecard provides regulatory-friendly explainability while XGBoost maximises predictive accuracy.
  • WOE transformation, with manual bin breaks where automatic binning falls short, yields monotonic risk relationships and produces an interpretable credit score in the 576–906 range, with score bands mapping directly to lending decisions.
  • Business input layer abstracts 21 raw dataset columns into 10 plain-language fields — non-technical users never interact with raw model features.
  • Monte Carlo simulation (up to 500,000 scenarios) quantifies tail risk metrics — Value at Risk (VaR) and Conditional VaR (CVaR) — that deterministic ECL alone cannot capture.
  • Three-scenario stress testing (Base, Mild +30% PD, Severe +80% PD) lets risk managers evaluate capital adequacy before adverse conditions materialise.
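  • Sensitivity analysis reveals that LGD has an elasticity of ~2.2× relative to PD, suggesting that, per unit of shift, recovery strategy improvements reduce ECL more than tightening credit selection. Note that the sweep applies absolute shifts to LGD and relative shifts to PD, so the two are not measured on the same scale.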
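
To make the WOE insight above concrete, here is a minimal sketch that computes Weight of Evidence and Information Value per bin and checks whether the WOE values are monotonic. The bin labels and counts are invented, and the sign convention (log of the bad share over the good share) is an assumption; libraries differ on this, so it may not match what scorecardpy reports.

```python
import math


def woe_table(bins):
    """bins: list of (label, n_good, n_bad) -> list of (label, woe, iv_part)."""
    total_good = sum(good for _, good, _ in bins)
    total_bad = sum(bad for _, _, bad in bins)
    rows = []
    for label, good, bad in bins:
        good_share = good / total_good
        bad_share = bad / total_bad
        # Assumed convention: positive WOE means the bin is riskier than average.
        woe = math.log(bad_share / good_share)
        iv_part = (bad_share - good_share) * woe
        rows.append((label, woe, iv_part))
    return rows


def is_monotonic(values):
    """True if values never change direction (all rising or all falling)."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Invented counts for a utilisation-style feature, ordered low to high.
bins = [
    ("[0.0, 0.3)", 4000, 400),
    ("[0.3, 0.7)", 3000, 600),
    ("[0.7, inf)", 2000, 1000),
]
table = woe_table(bins)
woes = [woe for _, woe, _ in table]
information_value = sum(iv for _, _, iv in table)

for label, woe, _ in table:
    print(f"{label:12s} WOE = {woe:+.3f}")
print("monotonic:", is_monotonic(woes), "| IV:", round(information_value, 3))
```

When a check like `is_monotonic` fails on automatically generated bins, merging or moving bin boundaries by hand is one way to restore the risk ordering.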

Technical Implementation

  • Model Architecture:
    • Scorecard: raw data → feature engineering (15+ derived features) → WOE binning via scorecardpy → Logistic Regression → additive points table → credit score.
    • Challenger: same engineered features → sklearn Pipeline (OrdinalEncoder + XGBClassifier with scale_pos_weight=3.52 for class imbalance) → probability output.
    • Feature selection removed collinear, legally sensitive, and negative-coefficient columns; manual WOE bin breaks enforced monotonicity for the UTILIZATION feature.
  • Training Pipeline (mlops/):
    • Reproducible end-to-end pipeline: load → clean → engineer → select → WOE bin → two-pass LR (first pass surfaces negative coefficients) → final LR + scorecard table → XGBoost with 5-fold CV.
    • Hyperparameters versioned in params.yaml; all runs logged to MLflow including AUC, KS statistic, Gini coefficient, confusion matrix, and ROC curve artifacts.
  • API Layer (FastAPI):
    • Six route groups: /predict, /predict/business, /ecl, /simulate, /stress-test, /sensitivity.
    • Pydantic schemas enforce input validation; CORS middleware allows the standalone frontend to call the API without a proxy.
    • Business input routes accept 10 plain-language fields and internally call input_mapper.py to reconstruct all 21 raw dataset columns before inference.
  • Risk Analytics Services:
    • ECL service — computes PD × LGD × EAD per borrower with optional segment-level breakdown; returns individual and portfolio totals.
    • Monte Carlo service — vectorised NumPy simulation producing Expected Loss, Unexpected Loss, VaR, CVaR, min/max, and a 200-point loss distribution sample for charting.
    • Stress testing service — applies PD multipliers and LGD overrides from risk_config.py; optionally overlays Monte Carlo on each scenario.
    • Sensitivity service — sweeps relative PD shifts and absolute LGD shifts, returning ECL change percentage and elasticity at each point.
  • Frontend Dashboard (HTML/CSS/JS):
    • Six pages: Dashboard, Prediction, Risk Analytics, Simulation, Stress Test, Sensitivity — all sharing a unified dark-themed design system.
    • Stress test results rendered via Chart.js bar chart; simulation results display VaR / CVaR metrics in a responsive grid layout.
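
The ECL, stress testing, and sensitivity services listed above can be sketched together in a few lines. This is an illustration under stated assumptions, not the platform's code: the three-loan portfolio is invented, LGD is fixed at 0.45 purely for the example, the scenarios use only the PD multipliers named in this write-up (the LGD overrides from risk_config.py are not shown in the source and are left out), and "elasticity" is taken to mean percentage change in ECL per unit of shift.

```python
def ecl(portfolio, pd_multiplier=1.0, lgd_shift=0.0):
    """Sum of PD x LGD x EAD, with a relative PD shift and an absolute LGD shift."""
    total = 0.0
    for pd_, lgd, ead in portfolio:
        stressed_pd = min(pd_ * pd_multiplier, 1.0)           # cap at 100%
        stressed_lgd = min(max(lgd + lgd_shift, 0.0), 1.0)    # keep within [0, 1]
        total += stressed_pd * stressed_lgd * ead
    return total


def pct_change(new, base):
    return (new - base) / base * 100.0


# Invented portfolio: (PD, LGD, EAD) per borrower. LGD = 0.45 is an assumption.
portfolio = [
    (0.02, 0.45, 100_000),
    (0.10, 0.45, 50_000),
    (0.25, 0.45, 20_000),
]

# PD multipliers as described in this write-up: Base, Mild +30%, Severe +80%.
SCENARIOS = {"Base": 1.0, "Mild": 1.3, "Severe": 1.8}

base = ecl(portfolio)
stress = {name: ecl(portfolio, pd_multiplier=m) for name, m in SCENARIOS.items()}

# Sensitivity: +10% relative PD shift versus +10 percentage point LGD shift.
pd_elasticity = pct_change(ecl(portfolio, pd_multiplier=1.10), base) / 10.0
lgd_elasticity = pct_change(ecl(portfolio, lgd_shift=0.10), base) / 10.0

for name, value in stress.items():
    print(f"{name:7s} ECL = {value:,.0f}")
print(f"PD elasticity  = {pd_elasticity:.2f}")
print(f"LGD elasticity = {lgd_elasticity:.2f}")
```

With the assumed LGD of 0.45, the absolute LGD shift moves ECL roughly 2.2 times as much as the relative PD shift, because a 10 percentage point move is a 22% relative change in LGD. Whether this is the source of the figure quoted in Key Insights depends on the platform's actual LGD values, which are not given here.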

Key Learnings

  • Scorecard development requires two LR passes — the first surfaces negative coefficients that violate monotonicity; removing them before the second pass is standard industry practice, not optional cleanup.
  • WOE binning is sensitive to auto-generated boundaries; manual breaks are sometimes necessary to enforce the risk ordering regulators expect.
  • Separating the business input layer from raw model features is architecturally essential — it decouples frontend UX from model internals and makes the API safe for non-technical integrations.
  • Monte Carlo simulation reveals tail risk that deterministic ECL masks completely — two portfolios with identical ECL can have very different VaR profiles depending on PD distribution shape.
  • MLflow experiment tracking becomes indispensable the moment you run more than a handful of training experiments; reproducing a specific run without it is extremely difficult.
  • Regulatory frameworks (IFRS 9, Basel III) are not abstract — building to their requirements from the start (ECL methodology, stress scenario definitions, model documentation) is far cheaper than retrofitting compliance later.
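
The Monte Carlo learning above (identical ECL, different tail risk) can be reproduced with a small vectorised NumPy simulation. This is a simplified sketch, not the platform's service: defaults are assumed to be independent Bernoulli draws with no correlation between borrowers, the two portfolios are invented, and 20,000 scenarios are used instead of 500,000 to keep memory use small.

```python
import numpy as np


def simulate_losses(pd_vec, lgd, ead, n_scenarios, seed):
    """Simulate portfolio losses assuming independent defaults (an assumption)."""
    rng = np.random.default_rng(seed)
    # One row per scenario, one column per borrower; True means default.
    defaults = rng.random((n_scenarios, pd_vec.size)) < pd_vec
    return (defaults * (lgd * ead)).sum(axis=1)


def var_cvar(losses, level=0.99):
    """VaR is the loss quantile; CVaR is the mean loss at or beyond VaR."""
    var = np.quantile(losses, level)
    cvar = losses[losses >= var].mean()
    return var, cvar


n_loans, lgd = 200, 0.45
ead = np.full(n_loans, 10_000.0)

# Portfolio A: every borrower has PD = 5%.
pd_a = np.full(n_loans, 0.05)
# Portfolio B: mostly safe borrowers plus a few very risky ones, same total PD.
pd_b = np.concatenate([np.full(190, 0.01), np.full(10, 0.81)])

ecl_a = (pd_a * lgd * ead).sum()
ecl_b = (pd_b * lgd * ead).sum()

losses_a = simulate_losses(pd_a, lgd, ead, n_scenarios=20_000, seed=1)
losses_b = simulate_losses(pd_b, lgd, ead, n_scenarios=20_000, seed=2)
var_a, cvar_a = var_cvar(losses_a)
var_b, cvar_b = var_cvar(losses_b)

print(f"ECL  A = {ecl_a:,.0f}   B = {ecl_b:,.0f}")
print(f"VaR  A = {var_a:,.0f}   B = {var_b:,.0f}")
print(f"CVaR A = {cvar_a:,.0f}   B = {cvar_b:,.0f}")
```

Both portfolios have the same deterministic ECL, yet the spread of simulated losses differs because the variance of each default indicator depends on its PD. Adding correlated defaults, which this sketch omits, would typically widen the tails further.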

Future Work

  • Add a model card with fairness metrics and feature importance for regulatory documentation.
  • Integrate a real-time data pipeline (Kafka or Airflow) so the platform ingests live transaction data rather than batch uploads.
  • Implement model drift detection — PSI (Population Stability Index) monitoring on score distributions over time.
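
For the drift-detection item above, a minimal PSI sketch could look like the following. The score bands and counts are invented for illustration, and the small `eps` floor for empty bins is one common workaround rather than a fixed part of the metric.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two binned score distributions."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    value = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor the shares so an empty bin does not cause log(0) or division by zero.
        e_share = max(expected / expected_total, eps)
        a_share = max(actual / actual_total, eps)
        value += (a_share - e_share) * math.log(a_share / e_share)
    return value


# Invented counts per score band, lowest band first.
baseline = [100, 200, 400, 200, 100]   # scores at training time
recent = [50, 150, 400, 250, 150]      # scores observed later

no_drift = psi(baseline, baseline)
drift = psi(baseline, recent)
print(f"PSI vs itself: {no_drift:.4f}")
print(f"PSI vs recent: {drift:.4f}")
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though any alert thresholds would need to be agreed for the specific portfolio.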

Built by Om Patel — ML Engineer & Data Scientist.
Explore more projects on my Portfolio.
